ABSTRACT: AN EVALUATION OF FITNESS REPORTS SCALES



A sample of convenience was obtained in which 15 officers completed
(anonymously) fitness reports on each other. The fitness report scales
were examined to determine their quality on two statistical
considerations: a "discrimination" index and a "disagreement" index. It
was found that there was a greater spread in scores (fitness marks) when
a number of judges rate one individual than when an average judge
rates a number of individuals. Generalizing from the study is
precluded by the size and nature of the sample. The study
demonstrates a type of analysis that can be performed and the type of
information that such studies can yield. Replications of the study
are recommended.
Problem and Background



How can one evaluate fitness report scales? Ideally, in measuring
the effectiveness of a scale, one would have some objective
measurement of an officer's work performance to which fitness report scale
marks would be related. The fitness report scales on which the marks
more closely reflected the objective measurement of work performance 
would, of course, be the more desirable scales. But such an objective 
measurement of job performance does not exist. If it did, there would 
be no need for the fitness report scales which involve human judgment 
and all its human errors, since the objective measurement of job 
performance itself would serve any purpose for which the scales are 
used. 

Since the objective measurement of work performance does not exist, 
other bases for evaluating the scales must be used. One basis used to 
evaluate scales is to make a judgment of their relevancy. This is 
perhaps the most important aspect of any scale. Someone has evidently 
judged that the scales on the fitness report are relevant to the 
performance of officers. Otherwise they would not have been included 
on the form. 

Another basis for evaluating scales is "statistical." This report
is concerned with the "statistical" evaluation of scales. Assuming
qualitative differences between ratees actually exist, statistically
good scales have the following two characteristics:

a. "Discrimination" between individuals, so that individuals
are not all rated at the same level.

b. Inter-rater reliability, or very little "disagreement" among
raters when they are judging the same behavior.

These two statistical characteristics can then be used to evaluate
fitness report scales. The information needed to evaluate the 
"disagreement" characteristic is not usually available. 

A retired Commanding Officer of a destroyer made available the
information that makes evaluation on both characteristics possible. Aboard
his destroyer he became intensely interested in the judgmental
evaluation of his officers. He had each of his fifteen officers complete
the then operational fitness report [NavPers 310 (Rev. 4-62), presented
as the Appendix to this report] on each of the other officers. The
reports were completed as usual except that the raters remained
anonymous. A summary of the sample's characteristics is presented
in Table 1.



TABLE 1

Population Characteristics

Rank     Frequency     USN(R)     Frequency     Designator     Frequency

ENS          4         USN           10         1100               9
LTJG         8         USN(R)         5         1105               5
LT           0                                  3100               1
LCDR         1
CDR          2
CAPT         0

Total       15                       15                           15


Procedure 

Points were assigned to each of the rating scales in items 14
through 20 (that is, items 14, 15, 16, and 20) as indicated on the
report form. Item 14 consists of 9-point scales, items 15 and 16
consist of 5-point scales, and item 20 consists of 7-point scales.
Comparisons among scales were restricted to comparisons within these
three sets of scales (9-, 5-, and 7-point lengths).

Index numbers were generated to reflect the two statistical
characteristics of discrimination and disagreement. Table 2 shows the
computation procedure for obtaining the "discrimination index" and
the "disagreement index."

TABLE 2

Computation of Discrimination Index and Disagreement Index

Each scale was analyzed as follows:

                                  RATEES
                        1      2      3      4      5      σ of Each Row

            A           X      X      X      X      X          σ(A)
            B           X      X      X      X      X          σ(B)
RATERS      C           X      X      X      X      X          σ(C)
            D           X      X      X      X      X          σ(D)
            E           X      X      X      X      X          σ(E)

                                     Mean of σ(A) through σ(E) = Discrimination Index

σ of Each Column       σ(1)   σ(2)   σ(3)   σ(4)   σ(5)

                                     Mean of σ(1) through σ(5) = Disagreement Index
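The layout of Table 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `scale_indexes` and the example marks are hypothetical, missing cells (an officer did not rate himself) are represented by `None`, and the population standard deviation is used because the report does not say which form of the standard deviation was computed.

```python
from statistics import mean, pstdev


def scale_indexes(marks):
    """Return (discrimination_index, disagreement_index) for one scale.

    marks[r][c] is the mark rater r gave ratee c. None means no mark,
    for example because an officer did not rate himself.
    """
    n_ratees = len(marks[0])
    # Spread of each rater's marks across ratees (row standard deviations).
    row_sds = [pstdev([m for m in row if m is not None]) for row in marks]
    # Spread of the marks each ratee received across raters
    # (column standard deviations).
    col_sds = [
        pstdev([row[c] for row in marks if row[c] is not None])
        for c in range(n_ratees)
    ]
    # Each index is the simple mean of the corresponding standard deviations.
    return mean(row_sds), mean(col_sds)


# Hypothetical marks on a 9-point scale: 3 raters (rows) by 3 ratees (columns).
example = [
    [None, 6, 8],
    [5, None, 8],
    [5, 7, None],
]
discrimination, disagreement = scale_indexes(example)
print(f"discrimination index = {discrimination:.2f}")
print(f"disagreement index   = {disagreement:.2f}")
```

In this made-up example the raters largely agree on each ratee while spreading their marks across ratees, so the discrimination index comes out well above the disagreement index, the pattern the report treats as desirable.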



Each scale was examined individually. "Discrimination" was first
determined for each rater. Even though the raters were not identified
on the reports, it was possible to group reports of the same rater by
matching certain miscellaneous characteristics of the reports. The
standard deviation of the marks the rater assigned for each scale
across ratees (in Table 2, the standard deviation of a row) was
computed; a numerically high standard deviation indicates good
discrimination. To obtain a discrimination index for each scale
across raters, a simple average (mean) of these standard deviations
was computed (in Table 2, the average of the row standard deviations).

An index of "disagreement" was also generated for each scale. The
standard deviation of the marks the raters assigned to each ratee
was computed (in Table 2, the standard deviation of a column). To
obtain a "disagreement index" across all ratees, a simple average (mean)
of these standard deviations was computed (in Table 2, the average
of the column standard deviations). These disagreement indexes are
influenced by two things: agreement among raters in their relative
ordering of ratees, and agreement among raters in the absolute level
of their ratings, the aspect usually influenced by leniency error.
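The two influences on the disagreement index can be illustrated with a small sketch. It is not part of the reported analysis: the helper names and the marks are hypothetical, and subtracting each rater's own mean mark is used here only as one simple way of setting aside differences in rating level.

```python
from statistics import mean, pstdev


def mean_column_sd(marks):
    """Mean of the column standard deviations (the disagreement index)."""
    n_ratees = len(marks[0])
    return mean([pstdev([row[c] for row in marks]) for c in range(n_ratees)])


def remove_rater_level(marks):
    """Subtract each rater's own mean mark from that rater's marks."""
    return [[m - mean(row) for m in row] for row in marks]


# Hypothetical marks: three raters order the ratees identically,
# but each rater uses a different overall level (leniency).
marks = [
    [5, 6, 7],  # lenient rater
    [3, 4, 5],  # severe rater
    [4, 5, 6],  # in-between rater
]
raw = mean_column_sd(marks)
adjusted = mean_column_sd(remove_rater_level(marks))
print(f"disagreement index, raw marks:           {raw:.2f}")
print(f"disagreement index, rater level removed: {adjusted:.2f}")
```

Because these hypothetical raters agree completely on the ordering of ratees, all of the disagreement in the raw marks comes from differences in level, and it vanishes once each rater's level is removed.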

Results 

The results of the analysis are shown in Table 3. Most of the
scales that were high in "disagreement" were low in "discrimination"
and vice versa. A scale that lacks either low disagreement or high
discrimination has reduced utility. Five of the scales were
relatively favorable on both the disagreement and discrimination
indexes. They are:

14e Performance - As Watch Officer

14f Performance - Technical Specialty ( )

20k Leadership - Personal Behavior

20l Leadership - Military Behavior

20m Leadership - Self-expression (oral)

They constitute the best of the scales as determined by this 
statistical analysis. 

Three of the scales were relatively poor on both disagreement and 
discrimination. They are: 

16c Foreign Duty 

20a Leadership - Professional Knowledge 

20b Leadership - Moral Courage 

The most significant finding, however, is the similarity in level
of "discrimination" and "disagreement" indexes. Ideally, judges would
rate an individual on a scale with perfect agreement; and, assuming
that individual differences exist on a scale, their ratings would
reflect the true range of individual differences on that scale. Of
the 26 scales on the fitness report, 17 scales have disagreement values
that numerically exceed their discrimination values. This finding
indicates that for these 17 scales, there is a greater spread in
scores when a number of judges rate one individual than when an
average judge rates a number of individuals. In other words, in
this sample the raters disagree on individuals' ratings on a scale
to a greater extent than average raters are able to discriminate
among individuals on the scale.
TABLE 3

Disagreement Index and Discrimination Index
of Each Fitness Report Scale

Item                                          Scale    Disagreement   Discrimination
No.   Title                                   Points   Index (1)      Index (2)

14    Performance
 (a)  Present Assignment                        9         1.30           1.16
 (b)  Shiphandling and Seamanship               9         1.14           1.11
 (c)  Airmanship                                9          --             --
 (d)  Collateral Duties                         9         1.21            .98
 (e)  As Watch Officer                          9         1.19           1.16
 (f)  Technical Specialty ( )                   9         1.03           1.15
 (g)  Command Potential or Ability              9         1.36           1.33
 (h)  Administrative & Management Ability       9         1.37           1.15

15    Overall Evaluation                        5          .61            .72

16    Desirability
 (a)  Operational                               5          .84            .81
 (b)  Staff or Administrative                   5          .82            .81
 (c)  Foreign Duty                              5          .84            .78

20    Leadership
 (a)  Professional Knowledge                    7          .89            .80
 (b)  Moral Courage                             7          .89            .80
 (c)  Loyalty                                   7          .79            .76
 (d)  Force                                     7          .91            .89
 (e)  Initiative                                7          .89            .88
 (f)  Industry                                  7          .87            .87
 (g)  Imagination                               7          .82            .78
 (h)  Judgment                                  7          .84            .79
 (i)  Reliability                               7          .85            .80
 (j)  Cooperation                               7          .90            .98
 (k)  Personal Behavior                         7          .79            .85
 (l)  Military Behavior                         7          .82            .94
 (m)  Self-expression (oral)                    7          .84            .84
 (n)  Self-expression (written)                 7          .73            .73

Notes:

(1) Mean standard deviation of ratings on same subjects by
different raters.

(2) Mean standard deviation of ratings for different subjects
by same raters.
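The count of 17 scales cited in the Results can be checked directly against the values in Table 3. The sketch below simply transcribes the table; the variable names are illustrative, and `None` stands for the Airmanship scale, for which the table reports no values.

```python
# (scale, disagreement index, discrimination index), copied from Table 3.
# None marks the Airmanship scale, for which no values are reported.
TABLE_3 = [
    ("14a", 1.30, 1.16), ("14b", 1.14, 1.11), ("14c", None, None),
    ("14d", 1.21, 0.98), ("14e", 1.19, 1.16), ("14f", 1.03, 1.15),
    ("14g", 1.36, 1.33), ("14h", 1.37, 1.15), ("15", 0.61, 0.72),
    ("16a", 0.84, 0.81), ("16b", 0.82, 0.81), ("16c", 0.84, 0.78),
    ("20a", 0.89, 0.80), ("20b", 0.89, 0.80), ("20c", 0.79, 0.76),
    ("20d", 0.91, 0.89), ("20e", 0.89, 0.88), ("20f", 0.87, 0.87),
    ("20g", 0.82, 0.78), ("20h", 0.84, 0.79), ("20i", 0.85, 0.80),
    ("20j", 0.90, 0.98), ("20k", 0.79, 0.85), ("20l", 0.82, 0.94),
    ("20m", 0.84, 0.84), ("20n", 0.73, 0.73),
]

# Keep only the scales that have index values.
scored = [row for row in TABLE_3 if row[1] is not None]
# Scales whose disagreement index numerically exceeds the discrimination index.
exceeds = [scale for scale, dis, disc in scored if dis > disc]
# Scales whose two indexes are equal to two decimal places.
ties = [scale for scale, dis, disc in scored if dis == disc]

print(len(TABLE_3), "scales,", len(scored), "with index values")
print(len(exceeds), "scales with disagreement above discrimination")
print(len(ties), "scales with equal indexes:", ties)
```

The tally reproduces the figure in the text: 17 of the 26 scales have a disagreement index above their discrimination index, three are tied, and five have the higher discrimination index.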
Conclusions and Recommendations

For this specific sample, the "statistically desirable" qualities
of each scale were determined and the five "statistically best" scales
were identified. In general it was found that raters differed among
themselves on the ratings they assigned to the same degree that an
average rater discriminated among ratees. Some implications of this
general finding are that: (1) fitness marks should be interpreted as
being highly dependent on the particular rater involved, and (2) there
is a need for training of raters and/or better definition of scales
so that inter-rater agreement would be increased. It is recognized
that only one of the 15 raters in this study (the Commanding Officer)
was a practiced rater. But since specific training in rating is not
normally provided for officers who will be expected to complete fitness
reports, the lack of experience in 14 of the 15 raters of this study
may not have reduced the representativeness of this sample.

It would be expected that ratings by peers would differ somewhat
from ratings by superiors or ratings by subordinates. This study
combined all three varieties (no choice due to anonymity of raters)
and this undoubtedly accounts for some of the unreliability among
raters. The accumulation of evidence from many such studies where
raters could be identified would reveal the specific ways in which
superiors, peers, and subordinates differ in their ratings. Statistical
corrections could then be applied in order to obtain a better estimate
of inter-rater reliability.

The sample size of this study was too small to justify
generalizing from the results. This study does, however, provide a
demonstration of the type of analysis that can be performed and the
type of information that such studies can yield.
Replications of this study within small clusters of officers who
are familiar with each other's job performance would permit the
accumulation of information from which generalizations could reasonably
be made.
APPENDIX

NAVPERS 310W (REV. 4-62) REPORT ON THE FITNESS OF OFFICERS WORKSHEET

1. NAME (Last, first, middle)

2. GRADE

3. USN(R)

4. DESIGNATOR

5. FILE NUMBER

6. SHIP OR STATION

7. DATE REPORTED PRESENT DUTY STATION

8. OCCASION FOR REPORT
[ ] PERIODIC   [ ] DETACHMENT OF REPORTING SENIOR   [ ] DETACHMENT OF OFFICER

9. TYPE OF REPORT
[ ] REGULAR   [ ] CONCURRENT   [ ] SPECIAL

10. PERIOD OF REPORT
FROM:          TO:

11. DUTIES (List principal duties assigned and the number of months during period for which assigned)

12. EMPLOYMENT OF COMMAND DURING PERIOD OF THIS REPORT

13. REFERENCE HERE AND APPEND ANY COMMENDABLE OR ADVERSE REPORTS ON THIS OFFICER RECEIVED DURING THE PERIOD OF THIS REPORT



14. PERFORMANCE OF DUTIES (Evaluate his performance of duty in comparison with other officers of his grade and approximate length of service)

Rating columns, left to right:

NOT OBS. OR N.A.
Outstanding performance. (9) (8)
Excellent performance. Frequently demonstrates outstanding performance. (7) (6)
Very good performance. Frequently demonstrates excellent performance. (5) (4)
Satisfactory performance. Basically qualified. (3) (2)
Inadequate performance. He is not qualified. (1) (Adverse) *

DUTY ASSIGNMENT

(a) PRESENT ASSIGNMENT

(b) SHIPHANDLING AND SEAMANSHIP

(c) AIRMANSHIP

(d) COLLATERAL DUTIES

(e) AS WATCH OFFICER

(f) TECHNICAL SPECIALTY ( )

(g) COMMAND POTENTIAL OR ABILITY

(h) ADMINISTRATIVE AND MANAGEMENT ABILITY
15. OVERALL EVALUATION:

(a) In comparison with other officers of his grade and approximate length of service, how would you designate this officer?

(b) For this report period indicate in (b) how many officers of his grade you have designated in each category of (a).

Rating columns, left to right:

NOT OBSERVED
One of the highly outstanding officers I know (5)
A very fine officer of great value to the service (4)
A dependable and typically effective officer (3)
An acceptable officer (2)
Unsatisfactory (1) (Adverse) *

(a)

(b)

16. DESIRABILITY: Considering (1) the possible requirements of war and peace, (2) this officer's professional and technical competence, and (3) the adaptability of this officer to the varying conditions of naval service, indicate your attitude toward having this officer under your command in the following types of assignments:

Rating columns, left to right:

NOT OBSERVED
Particularly desire
Prefer to most
Pleased to have
Satisfied to have
Prefer not to have (Adverse) *

(a) OPERATIONAL

(b) STAFF OR ADMINISTRATIVE

(c) FOREIGN DUTY

17. ENTRIES ON THIS REPORT ARE BASED ON (Check appropriate box)

[ ] DAILY CONTACT AND CLOSE OBSERVATION

[ ] FREQUENT OBSERVATION

[ ] INFREQUENT OBSERVATION

[ ] RECORDS AND REPORTS ONLY

18. FOR FUTURE ASSIGNMENTS:

Based on your observations, for what type of duty do you consider him best qualified for his next assignment at sea and shore?

SEA

SHORE

Comment, if appropriate

19. NAME, GRADE, FILE NUMBER, DESIGNATOR AND OFFICIAL TITLE OF REPORTING SENIOR.

NAVPERS 310W (REV. 4-62)



20. LEADERSHIP: In comparison with other officers of his grade and approximate length of duty assignment, to what degree has this officer exhibited the following qualities of leadership?

DEFINITIONS

OUTSTANDING - ONE out of 100 - Exceeds ALL others
EXCEPTIONAL - One of the next top FEW - Extraordinary
SUPERIOR - ABOVE the great MAJORITY
EXCELLENT - EQUAL to the majority
ACCEPTABLE - BELOW the majority
MARGINAL - Barely satisfactory
UNSATISFACTORY

Rating columns, left to right:

NOT OBSERVED
OUTSTANDING (7)
EXCEPTIONAL (6)
SUPERIOR (5)
EXCELLENT (4)
ACCEPTABLE (3)
MARGINAL (2)
UNSATISFACTORY (1) (Adverse)

(a) PROFESSIONAL KNOWLEDGE (Comprehension of all aspects of the profession)

(b) MORAL COURAGE (To do what he ought to do regardless of consequences to himself)

(c) LOYALTY (His faithfulness and allegiance to his shipmates, his command, the service and the nation)

(d) FORCE (The positive and enthusiastic manner with which he fulfills his responsibilities)

(e) INITIATIVE (His willingness to seek out and accept responsibility)

(f) INDUSTRY (The zeal exhibited and energy applied in the performance of his duties)

(g) IMAGINATION (Resourcefulness, creativeness, and capacity to plan constructively)

(h) JUDGMENT (His ability to develop correct and logical conclusions)

(i) RELIABILITY (The dependability and thoroughness exhibited in meeting responsibilities)

(j) COOPERATION (His ability and willingness to work in harmony with others)

(k) PERSONAL BEHAVIOR (His demeanor, disposition, sociability and sobriety)

(l) MILITARY BEARING (His military carriage, correctness of uniform, smartness of appearance and physical fitness)

(m) SELF-EXPRESSION (ORAL) (His ability to express himself orally)

(n) SELF-EXPRESSION (WRITTEN) (His ability to express himself in writing)

21. COMMENTS: (Reporting seniors are encouraged to discuss this report with the officer, but not necessarily show

(a) Make comments regarding any strengths, special accomplishments, contributions to the Naval and National service, or minor weaknesses. (Minor weaknesses must be discussed with the officer.)

Have minor weaknesses been discussed with officer?   [ ] NOT APPLICABLE

(b) ADVERSE COMMENTS, if any. Comments in this section are mandatory for adverse or unsatisfactory marks in section 14, 15, 16 and 20. Reports containing adverse matter must be referred for statement pursuant to Art. 1701.8, Navy Regulations. Statement of officer must be attached to this report. (Marks in starred (*) boxes are adverse.)

Has officer seen this report?   [ ] YES   [ ] NO

(c) What has been the trend of his performance since your last report?

[ ] FIRST REPORT   [ ]   [ ]   [ ]

22. DATE FORWARDED

SIGNATURE OF REPORTING SENIOR

23. CONCURRENT REPORT:

DATE FORWARDED

SIGNATURE OF REGULAR REPORTING SENIOR

A-44774